Uncontrolled Format String
   HOME

TheInfoList



OR:

Uncontrolled format string is a type of
software vulnerability Vulnerabilities are flaws in a computer system that weaken the overall security of the device/system. Vulnerabilities can be weaknesses in either the hardware itself, or the software that runs on the hardware. Vulnerabilities can be exploited by ...
discovered around 1989 that can be used in
security exploit An exploit (from the English verb ''to exploit'', meaning "to use something to one’s own advantage") is a piece of software, a chunk of data, or a sequence of commands that takes advantage of a bug or vulnerability to cause unintended or unanti ...
s. Originally thought harmless, format string exploits can be used to
crash Crash or CRASH may refer to: Common meanings * Collision, an impact between two or more objects * Crash (computing), a condition where a program ceases to respond * Cardiac arrest, a medical condition in which the heart stops beating * Couch su ...
a program or to execute harmful code. The problem stems from the use of
unchecked user input Improper input validation or unchecked user input is a type of vulnerability in computer software that may be used for security exploits. This vulnerability is caused when " e product does not validate or incorrectly validates input that can affect ...
as the
format string The printf format string is a control parameter used by a class of functions in the input/output libraries of C and many other programming languages. The string is written in a simple template language: characters are usually copied literal ...
parameter in certain C functions that perform formatting, such as printf(). A malicious user may use the %s and %x format tokens, among others, to print data from the
call stack In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, program stack, control stack, run-time stack, or ma ...
or possibly other locations in memory. One may also write arbitrary data to arbitrary locations using the %n format token, which commands printf() and similar functions to write the number of bytes formatted to an address stored on the stack.


Details

A typical exploit uses a combination of these techniques to take control of the
instruction pointer The program counter (PC), commonly called the instruction pointer (IP) in Intel x86 and Itanium microprocessors, and sometimes called the instruction address register (IAR), the instruction counter, or just part of the instruction sequencer, is ...
(IP) of a process, for example by forcing a program to overwrite the address of a library function or the return address on the stack with a pointer to some malicious
shellcode In hacking, a shellcode is a small piece of code used as the payload in the exploitation of a software vulnerability. It is called "shellcode" because it typically starts a command shell from which the attacker can control the compromised m ...
. The padding parameters to format specifiers are used to control the number of bytes output and the %x token is used to pop bytes from the stack until the beginning of the format string itself is reached. The start of the format string is crafted to contain the address that the %n format token can then overwrite with the address of the malicious code to execute. This is a common vulnerability because format bugs were previously thought harmless and resulted in vulnerabilities in many common tools. MITRE's CVE project lists roughly 500 vulnerable programs as of June 2007, and a trend analysis ranks it the 9th most-reported vulnerability type between 2001 and 2006. Format string bugs most commonly appear when a programmer wishes to output a string containing user supplied data (either to a file, to a buffer, or to the user). The programmer may mistakenly write printf(buffer) instead of printf("%s", buffer). The first version interprets buffer as a format string, and parses any formatting instructions it may contain. The second version simply prints a string to the screen, as the programmer intended. Both versions behave identically in the absence of format specifiers in the string, which makes it easy for the mistake to go unnoticed by the developer. Format bugs arise because C's argument passing conventions are not
type-safe In computer science, type safety and type soundness are the extent to which a programming language discourages or prevents type errors. Type safety is sometimes alternatively considered to be a property of facilities of a computer language; that is ...
. In particular, the varargs mechanism allows
function Function or functionality may refer to: Computing * Function key, a type of key on computer keyboards * Function model, a structured representation of processes in a system * Function object or functor or functionoid, a concept of object-oriente ...
s to accept any number of arguments (e.g. printf) by "popping" as many
argument An argument is a statement or group of statements called premises intended to determine the degree of truth or acceptability of another statement called conclusion. Arguments can be studied from three main perspectives: the logical, the dialectic ...
s off the
call stack In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, program stack, control stack, run-time stack, or ma ...
as they wish, trusting the early arguments to indicate how many additional arguments are to be popped, and of what types. Format string bugs can occur in other programming languages besides C, such as Perl, although they appear with less frequency and usually cannot be exploited to execute code of the attacker's choice.


History

Format bugs were first noted in 1989 by the
fuzz testing Fuzz may refer to: * Fuzz (film), ''Fuzz'' (film), a 1972 American comedy * ''Fuzz: When Nature Breaks the Law'', a nonfiction book by Mary Roach * The fuzz, a List of slang terms for police officers, slang term for police officers Music * Fuzz ...
work done at the University of Wisconsin, which discovered an "interaction effect" in the
C shell The C shell (csh or the improved version, tcsh) is a Unix shell created by Bill Joy while he was a graduate student at University of California, Berkeley in the late 1970s. It has been widely distributed, beginning with the 2BSD release of the ...
(csh) between its
command history Command history is a feature in many operating system shells, computer algebra programs, and other software that allows the user to recall, edit and rerun previous commands. Command line history was added to Unix in Bill Joy's C shell of 1978 ...
mechanism and an error routine that assumed safe string input. The use of format string bugs as an
attack vector In computer security, an attack vector is a specific path, method, or scenario that can be exploited to break into an IT system, thus compromising its security. The term was derived from the corresponding notion of vector in biology. An attack ve ...
was discovered in September 1999 by Tymm Twillman during a
security audit An information security audit is an audit on the level of information security in an organization. It is an independent review and examination of system records, activities and related documents. These audits are intended to improve the level of in ...
of the
ProFTPD ProFTPD (short for ''Pro FTP daemon'') is an FTP server. ProFTPD is Free and open-source software, compatible with Unix-like systems and Microsoft Windows (via Cygwin). Along with vsftpd and Pure-FTPd, ProFTPD is among the most popular FTP serve ...
daemon. The audit uncovered an
snprintf The C programming language provides many standard library functions for file input and output. These functions make up the bulk of the C standard library header . The functionality descends from a "portable I/O package" written by Mike Lesk ...
that directly passed user-generated data without a format string. Extensive tests with contrived arguments to printf-style functions showed that use of this for privilege escalation was possible. This led to the first posting in September 1999 on the
Bugtraq Bugtraq was an electronic mailing list dedicated to issues about computer security. On-topic issues are new discussions about vulnerabilities, vendor security-related announcements, methods of exploitation, and how to fix them. It was a high-volume ...
mailing list regarding this class of vulnerabilities, including a basic exploit. It was still several months, however, before the security community became aware of the full dangers of format string vulnerabilities as exploits for other software using this method began to surface. The first exploits that brought the issue to common awareness (by providing remote root access via code execution) were published simultaneously on the
Bugtraq Bugtraq was an electronic mailing list dedicated to issues about computer security. On-topic issues are new discussions about vulnerabilities, vendor security-related announcements, methods of exploitation, and how to fix them. It was a high-volume ...
list in June 2000 by
Przemysław Frasunek Przemysław Frasunek (also known as venglin, born 6 May 1983) is a "white hat" hacker from Poland. He has been a frequent Bugtraq poster since late in the 1990s, noted for one of the first published successful software exploits for the format s ...
and a person using the nickname ''tf8''. They were shortly followed by an explanation, posted by a person using the nickname ''lamagra''. "Format bugs" was posted to the
Bugtraq Bugtraq was an electronic mailing list dedicated to issues about computer security. On-topic issues are new discussions about vulnerabilities, vendor security-related announcements, methods of exploitation, and how to fix them. It was a high-volume ...
list by Pascal Bouchareine in July 2000. The seminal paper "Format String Attacks" by
Tim Newsham Tim Newsham is a computer security professional. He has been contributing to the security community for more than a decade. He has performed research while working at security companies including @stake, Guardent, ISS, and Network Associates (or ...
was published in September 2000 and other detailed technical explanation papers were published in September 2001 such as ''Exploiting Format String Vulnerabilities'', by team Teso.


Prevention in compilers

Many compilers can statically check format strings and produce warnings for dangerous or suspect formats. In the GNU Compiler Collection, the relevant compiler flags are, -Wall,-Wformat, -Wno-format-extra-args, -Wformat-security, -Wformat-nonliteral, and -Wformat=2. Most of these are only useful for detecting bad format strings that are known at compile-time. If the format string may come from the user or from a source external to the application, the application must validate the format string before using it. Care must also be taken if the application generates or selects format strings on the fly. If the GNU C library is used, the -D_FORTIFY_SOURCE=2 parameter can be used to detect certain types of attacks occurring at run-time. The -Wformat-nonliteral check is more stringent.


Detection

Contrary to many other security issues, the root cause of format string vulnerabilities is relatively easy to detect in x86-compiled executables: For printf-family functions, proper use implies a separate argument for the format string and the arguments to be formatted. Faulty uses of such functions can be spotted by simply counting the number of arguments passed to the function; an 'argument deficiency' is then a strong indicator that the function was misused.


Detection in x86-compiled binaries

Counting the number of arguments is often made easy on x86 due to a calling convention where the caller removes the arguments that were pushed onto the stack by adding to the stack pointer after the call, so a simple examination of the stack correction yields the number of arguments passed to the printf-family function.'


See also

*
Cross-application scripting Cross-application scripting (CAS) is a vulnerability affecting desktop applications that don't check input in an exhaustive way. CAS allows an attacker to insert data that modifies the behaviour of a particular desktop application. This makes it p ...
exploits a similar kind of programming error *
Cross-site scripting Cross-site scripting (XSS) is a type of security vulnerability that can be found in some web applications. XSS attacks enable attackers to inject client-side scripts into web pages viewed by other users. A cross-site scripting vulnerability may ...
*
printf The printf format string is a control parameter used by a class of functions in the input/output libraries of C and many other programming languages. The string is written in a simple template language: characters are usually copied literal ...
*
scanf A scanf format string (''scan f''ormatted) is a control parameter used in various functions to specify the layout of an input string. The functions can then divide the string and translate into values of appropriate data types. String scannin ...
*
syslog In computing, syslog is a standard for message logging. It allows separation of the software that generates messages, the system that stores them, and the software that reports and analyzes them. Each message is labeled with a facility code, i ...
* Improper input validation *
SQL injection In computing, SQL injection is a code injection technique used to attack data-driven applications, in which malicious SQL statements are inserted into an entry field for execution (e.g. to dump the database contents to the attacker). SQL inj ...
is a similar attack that succeeds when input is not filtered


References


Further reading

* * * (vii+663 pages) *


External links


Introduction to format string exploits
2013-05-02, by Alex Reece * scut / team- TESObr>Exploiting Format String Vulnerabilities
v1.2 2001-09-09
WASC Threat Classification - Format String Attacks

CERT Secure Coding Standards

CERT Secure Coding Initiative

Known vulnerabilities
at MITRE's CVE project.
Secure Programming with GCC and GLibc
(2008), by Marcel Holtmann {{DEFAULTSORT:Format String Attack Computer security exploits